WORLD INTELLECTUAL PROPERTY ORGANIZATION 
International Bureau 




PCT 

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification 6 : 
H04N7/26 



A1



(11) International Publication Number: WO 97/15146

(43) International Publication Date: 24 April 1997 (24.04.97)



(21) International Application Number: PCT/IB96/01099

(22) International Filing Date: 17 October 1996 (17.10.96)



(30) Priority Data: 

95202819.9 18 October 1995 (18.10.95) EP 

(34) Countries for which the regional or 

international application was filed: NL et al. 



(81) Designated States: JP, US, European patent (AT, BE, CH, DE,
DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE).



Published 

With international search report. 

Before the expiration of the time limit for amending the 
claims and to be republished in the event of the receipt of 
amendments. 



(71) Applicant (for all designated States except US): PHILIPS
ELECTRONICS N.V. [NL/NL]; Groenewoudseweg 1, NL-5621 BA
Eindhoven (NL).

(71) Applicant (for SE only): PHILIPS NORDEN AB [SE/SE]; 

Kottbygatan 7, Kista, S-164 85 Stockholm (SE).

(72) Inventors; and 

(75) Inventors/Applicants (for US only): BEUKER, Rob, Anne
[NL/NL]; Prof. Holstlaan 6, NL-5656 AA Eindhoven (NL).
THEUNIS, Hendrik, Gemmualdus, Jacobus [NL/NL]; Prof.
Holstlaan 6, NL-5656 AA Eindhoven (NL). HEUSDENS,
Richard [NL/NL]; Prof. Holstlaan 6, NL-5656 AA Eindhoven
(NL).

(74) Agent: SCHMITZ, Herman, J., R.; Internationaal Octrooibureau
B.V., P.O. Box 220, NL-5600 AE Eindhoven (NL).



(54) Title: METHOD OF ENCODING VIDEO IMAGES 



(57) Abstract 



A method of encoding video images is disclosed, in which different coding methods are applied to different regions of the image. 
The image is divided into blocks, and the coding method which is optimal in a rate-distortion sense is selected (2) for each block. In
an embodiment, transform coding (3), such as DCT or LOT, is applied to all blocks. The block size is selected in accordance with a 
rate-distortion criterion. 




WO 97/15146 

Method of encoding video images. 



PCT/IB96/01099



FIELD OF THE INVENTION 

The invention relates to a method of encoding video images, comprising
the steps of dividing said images into blocks, selecting one of a plurality of different coding
methods for each of said blocks, and encoding said blocks using the selected coding method
to obtain coded data for each block. The invention also relates to an arrangement for
carrying out said encoding method.

BACKGROUND OF THE INVENTION 

A method of encoding video images as described in the opening paragraph
is disclosed in European Patent Application EP-A 0 220 706. In this known method,
transform coding is applied to each block, the block size being variable in response to
brightness changes. The blocks are subdivided into smaller blocks so that the mean distortion
inside each block does not exceed an allowable value.

OBJECT AND SUMMARY OF THE INVENTION 

It is an object of the invention to further improve the video image
encoding method.

To this end, the method according to the invention is characterized in that
the step of selecting the encoding method comprises the determination of that coding method
which is optimal in a rate-distortion sense. An optimal compromise between rate and
distortion is thereby achieved.

In an embodiment of the method, the plurality of different coding methods
is applied to pixel blocks of equal size. Examples of different coding methods are transform
coding and fractal coding. In a further embodiment, the coding methods are all picture
transforms, but they are applied to pixel blocks of different block sizes. Transforms used in
transform coding are the Discrete Cosine Transform (DCT), the Hadamard transform, and the
Lapped Orthogonal Transforms (LOT), in particular the Modified LOT (MLOT), all known
in the art.

In a preferred embodiment of the method it is assumed that the statistics
of the image to be coded are Gaussian, and that the transform coefficients are uncorrelated.
In this embodiment, the rate and distortion, on which the selection of the optimal transform
type is based, can easily be calculated.

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 shows a diagram of a video encoding and transmitting station
employing the method according to the invention.

Fig. 2 shows examples of rate-distortion curves associated with different
coding methods.

Fig. 3 shows a flow chart of steps carried out by a segmentation circuit
which is shown in Fig. 1.

Fig. 4 shows a segmentation map of an image indicating the different
coding methods applied to different regions of the image.

DESCRIPTION OF PREFERRED EMBODIMENTS 

Fig. 1 shows a diagram of a video encoding and transmitting station
employing the method according to the invention. The arrangement receives a video input
signal X. In an optional subtracting circuit 1, a predicted video signal X_p is subtracted
therefrom. The encoder can thus operate in an intraframe mode or a (possibly
motion-compensated) interframe mode. The picture to be coded is applied to a segmentation
circuit 2 and a transform circuit 3. The segmentation circuit determines, for example in a
pre-analysis phase, which transform for a given block is optimal in a rate-distortion sense.
The circuit further merges the contiguous blocks subjected to the same transform so as to
form regions with the same transform. A "segmentation map" thus created is encoded for
transmission or storage by an encoding circuit 4.

The segmentation map is further applied to transform circuit 3 so as to
indicate which transform is to be carried out during the actual coding phase. The transform
coefficients obtained from transform circuit 3 are quantized and losslessly coded by a quantizer
and entropy coder 5. Quantization and entropy coding are well known in the art. For
example, MPEG-2-like coding can be used. The coefficients for each transform block are
zigzag-scanned. The DC coefficients are quantized using a fixed step size, and encoded
differentially. The AC coefficients are adaptively quantized and entropy-coded using a
combination of Huffman coding and run-length coding. An end-of-block code is transmitted
after the last non-zero AC coefficient of a block. The coded data thus obtained is multiplexed
with the encoded segmentation map by a multiplexer 6 and transmitted to a decoder or stored
on a storage medium (not shown).
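
The scanning steps described above can be illustrated with a minimal Python sketch. It is not taken from the application: the function names, the "EOB" marker string and the choice of 0 as the first DC predictor are assumptions made for illustration, and quantization and Huffman coding are left out.

```python
def zigzag_order(n):
    """(row, col) positions of an n x n block in zigzag order."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Sort by anti-diagonal; the traversal direction alternates per diagonal.
    return sorted(cells, key=lambda p: (p[0] + p[1],
                                        p[0] if (p[0] + p[1]) % 2 else p[1]))


def scan_block(block):
    """Return the DC value and the AC values up to the last non-zero one."""
    n = len(block)
    scanned = [block[r][c] for r, c in zigzag_order(n)]
    dc, ac = scanned[0], scanned[1:]
    while ac and ac[-1] == 0:
        ac.pop()                      # trailing zeros are implied by the marker
    return dc, ac + ["EOB"]           # "EOB" stands in for the end-of-block code


def dc_differences(dc_values):
    """Differential coding of successive DC values (first predictor assumed 0)."""
    prev, diffs = 0, []
    for dc in dc_values:
        diffs.append(dc - prev)
        prev = dc
    return diffs
```

For a 3*3 block with rows (9, 2, 0), (3, 0, 0), (0, 0, 0), scan_block returns the DC value 9 and the AC sequence 2, 3 followed by the end-of-block marker.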

The segmentation circuit 2 determines the optimal coding method in a
rate-distortion sense. The rate-distortion curve of a given coding method is the collection of
rate-distortion pairs (R,D) for different values of an encoding parameter t, e.g. the
quantization step size of a transform coder. Fig. 2 shows a rate-distortion curve 201
associated with a first coding method T1 and a second rate-distortion curve 202 associated
with a second coding method T2. In the following embodiment, transform coding is applied
to pixel blocks of non-equal size. The segmentation circuit 2 determines the optimal block
size. In the present example, two assumptions are made to speed up the segmentation
process: the statistics of the image to be coded are Gaussian, and the transform coefficients
are statistically independent. Under these assumptions, the following applies (see Toby
Berger: Rate Distortion Theory, A Mathematical Basis For Data Compression, Prentice-Hall,
Inc., Englewood Cliffs, New Jersey, 1971, pp. 110-111):

1. For each pixel block k which is processed, the rate R_k(t) and distortion D_k(t) are:

$$ R_k(t) = \frac{1}{2} \sum_i \max\left( \log\frac{c_{i,k}^2}{t},\ 0 \right) \tag{1} $$

$$ D_k(t) = \sum_i \min\left( c_{i,k}^2,\ t \right) \tag{2} $$

where c_{i,k} is the i-th coefficient of transform block k and t is an encoding
parameter, e.g. representative of a quantizer step size.

2. The slope s of the rate-distortion curve is:

$$ s = \frac{1}{2t} \tag{3} $$
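
Equation (3) can be checked against equations (1) and (2). The following is a sketch only. It assumes that the logarithm in equation (1) is the natural logarithm, and it writes n(t), a symbol introduced here purely for illustration, for the number of coefficients of block k with c_{i,k}^2 > t. Between two successive values of c_{i,k}^2 the count n(t) is constant, so that

$$
\frac{dR_k}{dt} = -\frac{n(t)}{2t}, \qquad
\frac{dD_k}{dt} = n(t), \qquad
\frac{dR_k}{dD_k} = \frac{dR_k/dt}{dD_k/dt} = -\frac{1}{2t}.
$$

Under these assumptions s = 1/(2t) is the magnitude of the slope of the rate-distortion curve, and it is the same for every block because it depends on t only.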



Fig. 3 shows a flow chart of steps carried out by segmentation circuit 2. In
a step 21, the circuit calculates the operating value of t in such a way that the global rate
R(t) equals a required rate R_req, i.e. such that:

$$ R(t) = \sum_k R_k(t) = R_{\mathrm{req}} $$

The value of t is found, for example, by using a bi-section algorithm. Table I shows an
example of such a bi-section algorithm in a pseudo-programming language. Of course, more
efficient algorithms, such as gradient methods, can be used.

Table I

    t_l = minimum non-zero value of c_{i,k}^2;
    R_l = R(t_l);
    t_h = maximum value of c_{i,k}^2;
    R_h = R(t_h);
    repeat
        R = R((t_l + t_h)/2);
        if R > R_req
            then t_l = (t_l + t_h)/2
            else t_h = (t_l + t_h)/2;
    until R ≈ R_req
    t = (t_l + t_h)/2
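
The bi-section search of Table I could be written out as follows. This is a minimal Python sketch rather than the applicant's implementation: the names global_rate, find_t and r_req, the tolerance tol and the iteration cap max_iter are assumptions, and the natural logarithm is assumed in equation (1).

```python
import math


def global_rate(coeff_sq, t):
    """Equation (1) summed over all squared coefficients (natural log assumed)."""
    return 0.5 * sum(max(math.log(c / t), 0.0) for c in coeff_sq if c > 0.0)


def find_t(coeff_sq, r_req, tol=1e-6, max_iter=200):
    """Bi-section on t so that global_rate(coeff_sq, t) is close to r_req."""
    nonzero = [c for c in coeff_sq if c > 0.0]
    t_lo, t_hi = min(nonzero), max(nonzero)
    for _ in range(max_iter):
        t_mid = 0.5 * (t_lo + t_hi)
        rate = global_rate(nonzero, t_mid)
        if abs(rate - r_req) <= tol:
            break                 # "until R is approximately the required rate"
        if rate > r_req:
            t_lo = t_mid          # rate too high: a larger t is needed
        else:
            t_hi = t_mid          # rate too low: a smaller t is needed
    return 0.5 * (t_lo + t_hi)
```

The search relies on the rate decreasing monotonically as t grows from the smallest to the largest squared coefficient, so the required rate should lie between the rates at those two end points.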

In a step 22, the circuit subjects each pixel block k to a given transform
so as to obtain transform coefficients c_{i,k}, and calculates the rate R_k(t) and distortion D_k(t) for
said block in accordance with equations (1) and (2), using the value t which was found in
step 21. The step 22 is repeated for different block sizes. In the present example, four
different transforms are considered: a 2*2 transform T1, a 4*4 transform T2, an 8*8
transform T3, and a 16*16 transform T4. In a step 23, it is checked whether or not all these
transforms have been processed.
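
Step 22 can be illustrated with a short Python sketch. It assumes an orthonormal DCT as the transform (the text allows other transforms as well) and the natural logarithm in equation (1); the function names are hypothetical and pure-Python loops are used only to keep the example self-contained.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of rows."""
    rows = []
    for u in range(n):
        scale = math.sqrt((1.0 if u == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
                     for x in range(n)])
    return rows


def dct2(block):
    """Separable 2-D DCT of a square block (list of rows): C * block * C^T."""
    n = len(block)
    c = dct_matrix(n)
    tmp = [[sum(c[u][x] * block[x][y] for x in range(n)) for y in range(n)]
           for u in range(n)]
    return [[sum(tmp[u][y] * c[v][y] for y in range(n)) for v in range(n)]
            for u in range(n)]


def rate_distortion(coeffs, t):
    """Equations (1) and (2) for one transform block."""
    rate, dist = 0.0, 0.0
    for row in coeffs:
        for coeff in row:
            c2 = coeff * coeff
            if c2 > t:
                rate += 0.5 * math.log(c2 / t)   # only terms with c^2 > t count
            dist += min(c2, t)
    return rate, dist
```

Calling rate_distortion(dct2(block), t) for the 2*2, 4*4, 8*8 and 16*16 partitions of the same image area yields the (R,D) pairs that are compared in the selection step.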

If the rate-distortion pair (R,D) has been calculated for each transform
type, the best transform is selected in a step 24. The best transform is the transform for
which the "Lagrangian cost" L, defined as L = R + s·D, is minimal. Herein, s is the slope of
the rate-distortion curve in accordance with equation (3). An adequate way of selecting the
best transform is achieved by pair-wise comparing the above transform results, i.e. by
carrying out the following substeps:

1. Compare, for a 4*4 block, four 2*2 T1 transform blocks with the
corresponding 4*4 T2 transform block.

2. Compare, for an 8*8 block, the 8*8 T3 transform with the transform resulting
from substep 1 for this block.

3. Compare, for a 16*16 block, the 16*16 T4 transform with the transform
resulting from substep 2 for this block.
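
The three substeps amount to a bottom-up comparison of each block with the best result found for its four quadrants. A minimal Python sketch under that reading is given below; the callback cost(x, y, size), assumed to return the Lagrangian cost L = R + s·D of coding the square at (x, y) with a single size*size transform, and the function name best_partition are hypothetical.

```python
def best_partition(cost, x, y, size, min_size=2):
    """Return (total_cost, blocks) for the square at (x, y) of side `size`.

    `blocks` is a list of (x, y, size) tuples describing the chosen transforms.
    """
    whole = cost(x, y, size)
    if size == min_size:
        return whole, [(x, y, size)]
    half = size // 2
    split_cost, split_blocks = 0.0, []
    for dy in (0, half):
        for dx in (0, half):
            sub_cost, sub_blocks = best_partition(cost, x + dx, y + dy,
                                                  half, min_size)
            split_cost += sub_cost
            split_blocks.extend(sub_blocks)
    # Pair-wise comparison: one large transform against the best split result.
    if whole <= split_cost:
        return whole, [(x, y, size)]
    return split_cost, split_blocks
```

Calling best_partition(cost, 0, 0, 16) performs substeps 1 to 3 for one 16*16 block; ties are resolved in favour of the larger transform, which is a design choice of this sketch and not something stated in the text.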

In a step 25, the selected transform type is stored in the segmentation
map, which defines a grid determined by the smallest block size. Fig. 4 shows an illustrative
example of such a segmentation map.
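
One possible in-memory form of such a map is a two-dimensional array with one cell per smallest block. The Python sketch below is illustrative only: the (x, y, size) tuple format for a selected block, the function name and the numbering 1 to 4 for the transforms T1 to T4 are assumptions.

```python
def segmentation_map(blocks, width, height, min_size=2):
    """Grid with one cell per min_size x min_size block, holding transform numbers."""
    cols, rows = width // min_size, height // min_size
    seg = [[0] * cols for _ in range(rows)]
    for x, y, size in blocks:
        # Sizes 2, 4, 8, 16 map to transform numbers 1, 2, 3, 4 when min_size is 2.
        number = (size // min_size).bit_length()
        for row in range(y // min_size, (y + size) // min_size):
            for col in range(x // min_size, (x + size) // min_size):
                seg[row][col] = number
    return seg
```

An 8*4 pixel area covered by one 4*4 block followed by four 2*2 blocks gives two identical grid rows, 2 2 1 1.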

Returning now to Fig. 1, the segmentation map is applied to transform
circuit 3 so as to indicate which transform type is to be used during the actual encoding
of the image. During this encoding process, the rate R_k(t) for block k as determined in
step 22 may be applied to a bitrate regulation circuit (not shown in Fig. 1) so as to actually
achieve the rate as determined by the segmentation circuit 2. Bitrate regulation circuits are
known in the art. The segmentation map is further applied to encoding circuit 4 for
transmission to the decoder or storage on a storage medium. A practical encoding strategy is
to assign a unique number to the different transform types. The transform number is losslessly
encoded, using DPCM. The resultant differences are transmitted by a combination of
Huffman coding and run-length coding.
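
The DPCM and run-length stages of this strategy can be sketched in Python as follows. The sketch makes several assumptions that are not in the text: the transform numbers are processed as one row of the map, the first predictor is 0, runs are represented as (value, count) pairs, and the Huffman stage is omitted. All function names are hypothetical.

```python
def dpcm(values):
    """Differences between successive transform numbers (first predictor 0)."""
    prev, out = 0, []
    for v in values:
        out.append(v - prev)
        prev = v
    return out


def undo_dpcm(diffs):
    """Inverse of dpcm: a running sum restores the transform numbers."""
    total, out = 0, []
    for d in diffs:
        total += d
        out.append(total)
    return out


def run_length(values):
    """Collapse repeated values into (value, count) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [tuple(r) for r in runs]
```

Because neighbouring cells of a region share the same transform number, the differences are mostly zero, which is what makes the run-length stage effective.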

An alternative embodiment for calculating the rate-distortion pairs (step 22
above) is to actually encode (transform, quantize, Huffman and run-length coding) each
potential image block k. In that case, the above assumptions (the statistics of the image to be
coded are Gaussian, and the transform coefficients are uncorrelated) are not needed.

It is also to be noted that different transforms with equal block sizes can 
be used in the automatic segmentation, for example Discrete Cosine Transforms, Hadamard 
transforms, or Lapped Transforms such as the Modified Lapped Orthogonal Transform. 

It is further to be noted that a provision in the coding process is required 
to switch between the different transforms at the contour between regions, while maintaining 
(near) perfect reconstruction. For example, using linear phase transforms, this can be 
accomplished by mirroring at the region boundaries. 

In summary, a method of encoding video images is disclosed in which
different coding methods are applied to different regions of the image. The image is divided
into blocks, and for each block the coding method is selected which is optimal in a
rate-distortion sense. In an embodiment, transform coding, such as DCT or LOT, is applied to all
blocks. The block size is selected in accordance with a rate-distortion criterion.

CLAIMS

1. A method of encoding video images, comprising the steps of dividing said
images into blocks, selecting one of a plurality of different coding methods for each of said
blocks, encoding said blocks using the selected coding method to obtain coded data for each
block, and transmitting data indicating the selected coding method and said coded data,
characterized in that the step of selecting the encoding method comprises the determination of
that coding method which is optimal in a rate-distortion sense.



2. A method as claimed in Claim 1, wherein the plurality of different coding 
methods is applied to pixel blocks of equal size. 

3. A method as claimed in Claim 1, wherein the plurality of different coding
methods are signal transforms applied to pixel blocks of different block sizes.

4. A method as claimed in Claim 3, wherein the step of determining the
optimal coding method implies the calculation of the rate R(t) and distortion D(t) in
accordance with

$$ R_k(t) = \frac{1}{2} \sum_i \max\left( \log\frac{c_{i,k}^2}{t},\ 0 \right) $$

$$ D_k(t) = \sum_i \min\left( c_{i,k}^2,\ t \right) $$

where c_{i,k} is the i-th coefficient of transform block k and t is a quantization parameter.

5. An arrangement for encoding video images, comprising means for
dividing said images into blocks, means for selecting one of a plurality of different coding
methods for each of said blocks, means for encoding said blocks using the selected coding
method to obtain coded data for each block, and means for transmitting data indicating
the selected coding method and said coded data, characterized in that the means for selecting
the encoding method comprise means for determining which coding method is optimal in a
rate-distortion sense.

6. An arrangement as claimed in Claim 5, wherein the plurality of different
coding methods is applied to pixel blocks of equal size.

7. An arrangement as claimed in Claim 5, wherein the plurality of different
coding methods are signal transforms applied to pixel blocks of different block sizes.

8. An arrangement as claimed in Claim 7, wherein the means for determining
the optimal coding method is adapted to calculate the rate R(t) and distortion D(t) in
accordance with

$$ R_k(t) = \frac{1}{2} \sum_i \max\left( \log\frac{c_{i,k}^2}{t},\ 0 \right) $$

$$ D_k(t) = \sum_i \min\left( c_{i,k}^2,\ t \right) $$

where c_{i,k} is the i-th coefficient of transform block k and t is a quantization parameter.



[Drawings, sheet 1/2: FIG. 2, rate-distortion curves (axis label R).]

INTERNATIONAL SEARCH REPORT 


International application No. 
PCT/IB 96/01099 


A. CLASSIFICATION OF SUBJECT MATTER 



IPC6: H04N 7/26 

According to International Patent Classification (IPC) or to both national classification and IPC
B. FIELDS SEARCHED

Minimum documentation searched (classification system followed by classification symbols)

IPC6: H04N

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched

SE, DK, FI, NO classes as above

Electronic data base consulted during the international search (name of data base and, where practicable, search terms used)



C. DOCUMENTS CONSIDERED TO BE RELEVANT

Category*  Citation of document, with indication, where appropriate,      Relevant to
           of the relevant passages                                        claim No.

Y          US 5241395 A (CHENG-TIE CHEN), 31 August 1993 (31.08.93),       1-8
           column 1, line 63 - column 2, line 35;
           column 3, line 12 - line 14

Y          EP 0549813 A1 (SONY CORPORATION), 7 July 1993 (07.07.93),       1-8
           column 1, line 10 - column 2, line 49

A          Toby Berger, "Rate Distortion Theory", 1971,                    4,8
           Prentice-Hall, (New Jersey, USA), page 110 - page 111



[X] Further documents are listed in the continuation of Box C. [X] See patent family annex.






Date of the actual completion of the international search 
27 March 1997 


Date of mailing of the international search report 

0 2 -0V 1997 


Name and mailing address of the ISA/ 
Swedish Patent Office 
Box 5055, S-102 42 STOCKHOLM
Facsimile No. +46 8 666 02 86 


Authorized officer 
Anders Strobeck 

Telephone No. + 46 8 782 25 00 



Form PCT/ISA/210 (second sheet) (July 1992)




C (Continuation). DOCUMENTS CONSIDERED TO BE RELEVANT

Category*

P,A

Citation of document, with indication, where appropriate, of the relevant passages

IEEE Global Telecommunications Conference Globecom '91, Volume 1, December 1991, (USA),
Sullivan et al, "Rate-distortion optimized motion compensation for video compression
using fixed or variable size blocks"

EP 0220706 A2 (HITACHI, LTD), 6 May 1987 (06.05.87), page 3, line 26 - page 4, line 10

US 5506686 A (AUYEUNG ET AL), 9 April 1996 (09.04.96), column 2, line 56 - column 3, line 35

Relevant to claim No.

3,4,7,8

1-8

1-8

Form PCT/ISA/210 (continuation of second sheet) (July 1993)




Information on patent family members 

04/03/97 




Patent document            Publication    Patent family     Publication
cited in search report     date           member(s)         date

US 5241395 A               31/08/93       NONE

EP 0549813 A1              07/07/93       AU 656215 B       27/01/95
                                          AU 2330492 A      23/02/93
                                          CA 2091579 A      20/01/93
                                          JP 5276500 A      22/10/93
                                          JP 5276506 A      22/10/93
                                          US 5543843 A      06/08/96
                                          WO 9302528 A      04/02/93

EP 0220706 A2              06/05/87       DE 3686754 A      22/10/92
                                          JP 8024341 B      06/03/96
                                          JP 62101183 A     11/05/87
                                          US 4831659 A      16/05/89

US 5506686 A               09/04/96       AU 3506195 A      17/06/96
                                          CA 2178943 A      30/05/96
                                          EP 0740882 A      06/11/96
                                          WO 9616508 A      30/05/96

Form PCT/ISA/210 (patent family annex) (July 1992)



